Abstract. The structure of real networked systems, such as social relationships, can be modeled as temporal networks in which each edge appears only at a prescribed time. Understanding the structure of temporal networks requires quantifying the importance of a temporal vertex, which is a pair of a vertex index and a time. In this paper, we define two centrality measures of a temporal vertex based on the fastest temporal paths that use the temporal vertex. The definitions are free from parameters and robust against changes in the time scale on which we focus. In addition, we can efficiently compute these centrality values for all temporal vertices. Using the two centrality measures, we reveal that the distributions of these centrality values in real-world temporal networks are heterogeneous. For various datasets, we also demonstrate that a majority of the highly central temporal vertices are located within a narrow time window around a particular time. In other words, there is a bottleneck time at which most information sent in the temporal network passes through a small number of temporal vertices, which suggests an important role of these temporal vertices in spreading phenomena.

PACS. 89.75.Fb Structures and organization in complex systems - 89.75.He Networks and genealogical 
trees - 64.60.aq Networks 


1 Introduction 

Complex networks such as social networks, information 
networks, and biological networks have been intensively 
studied in the past decade to understand their behavior 
under certain dynamics and develop efficient algorithms 
for them. See [1-4] for extensive surveys. 

However, many real-world networks are actually temporal networks [5,6], in which a vertex communicates with another vertex at a specific time over a finite duration. For example, social interactions between individuals, passenger flows between cities, and synaptic transmission between neurons can be represented as temporal networks. When we assume that the focal dynamical processes on networks, such as information propagation, occur on a time scale comparable to that of the change in network structure, a temporal-network representation gives us a precise way to capture the processes. We can describe the advantage of working with a temporal network using the example shown in Fig. 1. This temporal network consists of four vertices and eight edges, each of which is labeled with the time at which it appears. Let us assume that it takes unit time to send information from the tail to the head of an edge. For example, suppose that the information starts to propagate from $v_1$ at time 1. Then, it reaches $v_2$ at time 2 through edge $(v_1, v_2)$, waits at $v_2$ until time 3, and then reaches $v_3$ at time 4 through edge $(v_2, v_3)$. The information never reaches $v_4$ because the only edge incoming to $v_4$ is $(v_2, v_4)$, which appears at time 1, and $v_2$ does not have the information at that time. However, if we ignore the temporal information and regard the network as a static directed network, we mistakenly conclude that information at $v_1$ at time 1 can reach $v_4$ because there is a directed path from $v_1$ to $v_4$. Therefore, we cannot dismiss temporal information if we are to properly understand the structure of temporal networks.
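The contrast between temporal and static reachability in this example can be checked with a few lines of code. The following is only a minimal sketch: the edge list contains just the three edges named in the text (the remaining edges of Fig. 1 are not reproduced here), and the function names are hypothetical.

```python
from collections import defaultdict

# ASSUMPTION: only the three edges named in the text, as (tail, head, appearance time).
EDGES = [("v1", "v2", 1), ("v2", "v3", 3), ("v2", "v4", 1)]


def temporal_arrival_times(edges, source, start_time, delay=1):
    """Earliest time at which information from (source, start_time) reaches each vertex."""
    arrival = {source: start_time}
    # With a positive delay, one pass over the edges in order of appearance suffices.
    for tail, head, t in sorted(edges, key=lambda e: e[2]):
        if tail in arrival and arrival[tail] <= t:
            arrival[head] = min(arrival.get(head, float("inf")), t + delay)
    return arrival


def static_reachable(edges, source):
    """Vertices reachable from source when the appearance times are ignored."""
    adj = defaultdict(list)
    for tail, head, _ in edges:
        adj[tail].append(head)
    seen, stack = {source}, [source]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen


print(temporal_arrival_times(EDGES, "v1", 1))
print(static_reachable(EDGES, "v1"))
```

In the temporal computation $v_4$ never receives the information, whereas the static search reports it as reachable.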

An important notion studied to understand the structure of (static) networks is vertex centrality, which measures the importance of a vertex. The following reasons motivate the study of centralities. First, we can use centralities to find important vertices in several applications such as suppressing epidemics [7,8] or maximizing the spread of influence [9]. Second, we can use them to understand the structure of real-world networks by examining the difference between the distributions of the centrality values in such networks and in randomized networks (e.g., [10,11]). Third, we can examine the validity of generative network models by investigating the distribution of centralities of the generated network (e.g., [12,13]).

Fig. 1. Schematic of an example temporal network. The number associated with each edge represents the time at which the edge appears.

Hence, it is natural to study centralities for temporal networks. Since the most fundamental difference between a static network and a temporal network is that the latter involves time, we define the centrality of a vertex at a specific time. To distinguish it from a vertex, we call the pair of a vertex and a time a temporal vertex. In the literature, multiple centrality notions of temporal vertices based on temporal paths [5] have been proposed. Examples include generalizations of static centrality notions to temporal networks, such as betweenness [15-18], closeness [16,17,19], communicability [20-22], efficiency [14], random-walk centrality [23], and win-lose score [24] (see Ref. [25] for a review of some of them). However, each previous centrality notion suffers from at least one of the following two issues:

1. We need to carefully set parameter values and (or) the time interval within which we consider temporal paths.

2. It is inefficient to compute the centrality.

For the first issue, the time interval length especially requires careful tuning; if the time interval is too wide, then the centrality of a temporal vertex $\mathbf{v}$ becomes negligible because most of the paths finish before or start after $\mathbf{v}$ appears. By contrast, if the time interval is too narrow, again the centrality of $\mathbf{v}$ becomes negligible because paths can pass by only a tiny fraction of vertices in the time interval. It should be noted that our centrality measures are parameter-free not merely because we consider the centrality of a temporal vertex; the centrality measures of temporal vertices proposed in previous work [14-25] also require some parameters, for various reasons. Our centrality measures get around this issue by focusing on the local structure of temporal paths around the focal temporal vertex. For the second issue, even if we compromise and use an approximation, computing the approximate centrality value of a single temporal vertex requires computational time at least linear in the network size [26].

In this paper, we propose two novel centrality notions for temporal networks that resolve these issues. The first one, called temporal coverage centrality (TCC), measures the fraction of pairs of (normal) vertices that have at least one fastest temporal path that uses the focal temporal vertex. The second one, called temporal boundary coverage centrality (TBCC), measures the fraction of pairs of vertices that have a unique fastest temporal path, which uses the focal temporal vertex.


Our centrality notions address the two issues described above in the following way. For the first issue, TCC and TBCC do not require setting any parameters or time interval. To calculate the TCC or TBCC value of a temporal vertex $\mathbf{v} = (v, \tau)$, we only have to run over all pairs of vertices $(u, w)$. Namely, we consider temporal vertices $\mathbf{u} = (u, \tau_u)$ and $\mathbf{w} = (w, \tau_w)$, where $\tau_u$ is the latest time at which we can send information from $u$ so that it reaches $v$ at time $\tau$, and $\tau_w$ is the earliest time at which we can receive information at $w$ that is sent from $v$ at time $\tau$. It should be noted that, once we fix the focal temporal vertex $\mathbf{v}$, $\tau_u$ and $\tau_w$ are uniquely determined by $u$ and $w$, respectively, and thus we do not have to care about the time interval around $\mathbf{v}$. Then, we check whether the information sent from $\mathbf{u} = (u, \tau_u)$ to $\mathbf{w} = (w, \tau_w)$ can or should drop by $\mathbf{v}$.

For the second issue, although the definitions of TCC and TBCC might look complicated and hard to compute, this is not the case. Indeed, computing TCC and TBCC can be reduced to the problem of deciding whether or not there is a directed path between queried vertices in an associated directed network (see Section 2.2 for details). The latter problem is well studied in the database community [38-42], and it can be solved by constructing an index of the directed network, which computes the reachability between any pair of nodes by using information on the reachability between a fraction of node pairs. If it suffices to use approximations to the TCC and TBCC values, we only need to query the index a number of times that is polylogarithmic in $N$, where $N$ is the total number of vertices in the network (see Appendix A). Since we can efficiently process queries to the index in practice, this method is advantageous compared to the $O(N)$ time for approximating previous centrality notions.

With the aid of our centrality notions, we are able to compute the centrality of all temporal vertices in a temporal network and analyze the statistics of the whole network. Using TBCC, we demonstrate that real-world temporal networks have a small number of temporal vertices without which information propagates more slowly. Surprisingly, we reveal that the temporal vertices with large centrality values form a narrow time region, and this time region seemingly corresponds to the beginning or the end of a time interval in which temporal edges occur in a bursty manner. In addition, by using TCC, we show that the remaining part of the temporal network is highly redundant in the sense that there are many ways to send information as quickly as possible. Although these properties are recognized in the network science community [28-30], we quantitatively confirm them for the first time using our centrality notions. We also demonstrate that the removal of temporal vertices according to their TBCC values is effective for hindering the propagation of information, both by delaying it and by stopping it.

The paper is organized as follows. In Section 2, we introduce basic notions of temporal networks and the directed network associated with a temporal network. Section 3 introduces our centrality notions for temporal vertices, and Section 4 explains detailed methods of computing our centrality notions. Section 5 is dedicated to demonstrating our experimental results. We give the conclusion in Section 6.


2 Preliminaries about temporal networks 

2.1 Basic notions 

We introduce the terminology and symbols to describe 
temporal network structure, which basically follow those 
used in Ref. [31]. 

For an integer $k$, let $[k]$ denote the set $\{1, 2, \ldots, k\}$. We define $\mathbb{R}_+$ as the set of non-negative real numbers.

Let $V$ be the set of vertices. A temporal edge is represented by a quadruplet $e = (u, v, \tau, \lambda)$, where $u, v \in V$, $\tau \in \mathbb{R}$, and $\lambda \in \mathbb{R}_+$. For a temporal edge $e = (u, v, \tau, \lambda)$, we refer to $\tau$, $\lambda$, and $\tau + \lambda$ as the starting time, the duration, and the ending time of $e$, respectively. A temporal network $G = (V, E)$ is a pair of a set of vertices $V$ and a set of temporal edges $E$.

When we study temporal networks, a vertex at a certain time is of interest. Therefore, we define a temporal vertex as a pair of a vertex $v \in V$ and a time $\tau \in \mathbb{R}$. In the following, we always use bold symbols such as $\mathbf{v}$ to denote temporal vertices. For a temporal vertex $\mathbf{v} = (v, \tau)$, we denote the time $\tau$ by $\tau(\mathbf{v})$.

A temporal path $P$ in a temporal network $G = (V, E)$ is defined as an alternating sequence of temporal vertices and edges $P = (\mathbf{v}_1, e_1, \mathbf{v}_2, \ldots, e_{k-1}, \mathbf{v}_k)$ satisfying the following properties. Let $\mathbf{v}_i = (v_i, \tau_i)$ for each $i \in [k]$. Then, for each $i \in [k-1]$, the $i$-th temporal edge is of the form $e_i = (v_i, v_{i+1}, \tau, \lambda)$ such that $\tau_i \le \tau$ and $\tau + \lambda \le \tau_{i+1}$. We define the starting time, the duration, and the ending time of $P$ as $\tau_1$, $\tau_k - \tau_1$, and $\tau_k$, respectively. For two temporal vertices $\mathbf{u}$ and $\mathbf{v}$, the relationship $\mathbf{u} \rightsquigarrow \mathbf{v}$ indicates that there is a temporal path from $\mathbf{u}$ to $\mathbf{v}$.

We define the earliest arrival time at a vertex $w$ when departing from a temporal vertex $\mathbf{v}$ as the smallest $\tau \in \mathbb{R}$ such that $\mathbf{v} \rightsquigarrow (w, \tau)$, and we denote it by $\tau_{\mathrm{eat}}(\mathbf{v}, w)$. If there is no such $\tau$, we define $\tau_{\mathrm{eat}}(\mathbf{v}, w) = \infty$. Similarly, we define the latest departure time from a vertex $u$ for arriving at $\mathbf{v}$ as the largest $\tau \in \mathbb{R}$ such that $(u, \tau) \rightsquigarrow \mathbf{v}$, and we denote it by $\tau_{\mathrm{ldt}}(\mathbf{v}, u)$. If there is no such $\tau$, we define $\tau_{\mathrm{ldt}}(\mathbf{v}, u) = -\infty$. A fastest temporal path from a temporal vertex $\mathbf{v}$ to a vertex $w$ is a temporal path from $\mathbf{v}$ to $(w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$, and a fastest temporal path from a vertex $u$ to a temporal vertex $\mathbf{v}$ is a temporal path from $(u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$ to $\mathbf{v}$.
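To make the two quantities concrete, the following sketch computes the earliest arrival times and the latest departure times with Dijkstra-like searches directly on the list of temporal edges. It is an illustration under stated assumptions, not the indexing method used later in the paper: temporal edges are given as tuples `(u, v, tau, lam)`, vertex labels are mutually comparable (e.g., strings), and the function names are hypothetical. Vertices missing from a result correspond to the values $\infty$ and $-\infty$, respectively.

```python
import heapq
from collections import defaultdict

INF = float("inf")


def earliest_arrival(edges, v, tau):
    """Earliest arrival time at every vertex when departing from (v, tau)."""
    out = defaultdict(list)
    for a, b, t, lam in edges:
        out[a].append((b, t, lam))
    best = {v: tau}
    heap = [(tau, v)]
    while heap:
        t_a, a = heapq.heappop(heap)
        if t_a > best[a]:
            continue  # stale heap entry
        for b, t, lam in out[a]:
            # The edge is usable if we are at a no later than its starting time.
            if t_a <= t and t + lam < best.get(b, INF):
                best[b] = t + lam
                heapq.heappush(heap, (t + lam, b))
    return best


def latest_departure(edges, v, tau):
    """Latest departure time from every vertex for arriving at (v, tau)."""
    inc = defaultdict(list)
    for a, b, t, lam in edges:
        inc[b].append((a, t, lam))
    best = {v: tau}
    heap = [(-tau, v)]  # max-heap on time via negation
    while heap:
        neg_t, b = heapq.heappop(heap)
        t_b = -neg_t
        if t_b < best[b]:
            continue  # stale heap entry
        for a, t, lam in inc[b]:
            # The edge is usable if it ends no later than the time we must be at b.
            if t + lam <= t_b and t > best.get(a, -INF):
                best[a] = t
                heapq.heappush(heap, (-t, a))
    return best


edges = [("v1", "v2", 1, 1), ("v2", "v3", 3, 1), ("v2", "v4", 1, 1)]
print(earliest_arrival(edges, "v1", 1))
print(latest_departure(edges, "v3", 4))
```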

2.2 Directed acyclic graph representation 

A directed acyclic graph (DAG) is a directed network with 
no directed cycle. In this section, we describe the DAG 
representation of a temporal network, which is useful when 
solving problems related to temporal paths and describing 
the centrality notions we will introduce in Section 3. This 
DAG representation and its variants have been considered 
in the analysis of temporal networks [17,32-36]. 
Fig. 2. DAG representation of the temporal network shown in Fig. 1.

For a temporal network $G = (V, E)$, the DAG representation of $G$, denoted by $\tilde{G} = (\tilde{V}, \tilde{E})$, is constructed as follows. A vertex in $\tilde{G}$ represents a temporal vertex in $G$. For each $v \in V$, we first add to $\tilde{V}$ two vertices corresponding to the temporal vertices $(v, -\infty)$ and $(v, \infty)$. For each temporal edge $(u, v, \tau, \lambda) \in E$, we add to $\tilde{V}$ two vertices corresponding to temporal vertices $\mathbf{u} = (u, \tau)$ and $\mathbf{v} = (v, \tau + \lambda)$ (if they do not exist in $\tilde{V}$) and add edge $(\mathbf{u}, \mathbf{v})$ to $\tilde{E}$. Finally, for each pair of temporal vertices $\mathbf{u} = (u, \tau)$ and $\mathbf{u}' = (u, \tau')$ with $\tau < \tau'$ sharing the same vertex $u$, we add edge $(\mathbf{u}, \mathbf{u}')$ to $\tilde{E}$ if there is no temporal vertex of the form $(u, \tau'')$ in $\tilde{V}$ such that $\tau < \tau'' < \tau'$.
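The construction above can be sketched as follows. This is a minimal illustration, assuming that temporal edges are tuples `(u, v, tau, lam)` with positive duration, that every edge endpoint appears in the vertex list, and that `build_dag` is a hypothetical helper name.

```python
import math
from collections import defaultdict


def build_dag(vertices, edges):
    """Return (nodes, arcs) of the DAG representation.

    nodes: set of temporal vertices (v, time)
    arcs:  set of pairs ((u, t), (v, t2)) of temporal vertices
    """
    times = defaultdict(set)
    for v in vertices:
        times[v].update({-math.inf, math.inf})  # sentinels (v, -inf) and (v, inf)
    arcs = set()
    for u, v, tau, lam in edges:
        times[u].add(tau)
        times[v].add(tau + lam)
        arcs.add(((u, tau), (v, tau + lam)))  # arc for the temporal edge itself
    for v, ts in times.items():
        ordered = sorted(ts)
        # Waiting arcs connect consecutive temporal vertices of the same vertex.
        for t, t_next in zip(ordered, ordered[1:]):
            arcs.add(((v, t), (v, t_next)))
    nodes = {(v, t) for v, ts in times.items() for t in ts}
    return nodes, arcs


vertices = ["v1", "v2", "v3", "v4"]
edges = [("v1", "v2", 1, 1), ("v2", "v3", 3, 1), ("v2", "v4", 1, 1)]
nodes, arcs = build_dag(vertices, edges)
print(len(nodes), len(arcs))
```

Every arc points from an earlier time to a later one, which is why the resulting network has no directed cycle.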

Figure 2 illustrates the DAG representation $\tilde{G}$ of the temporal network $G$ shown in Fig. 1. The vertex in the $i$-th row and the $j$-th column corresponds to the temporal vertex $(v_i, j)$. For example, since there is a temporal edge $(v_1, v_2, 1, 1)$ in $G$, we have an edge from $(v_1, 1)$ to $(v_2, 2)$ in $\tilde{G}$. For the $i$-th row, the leftmost and rightmost vertices correspond to the temporal vertices $(v_i, -\infty)$ and $(v_i, \infty)$, respectively.

From the construction of the DAG representation, we 
have the following useful properties: 

Lemma 1 Let $G$ be a temporal network. Then, its DAG representation $\tilde{G}$ is a DAG.

Proof This is clear as we only add edges of the form $((u, \tau), (v, \tau'))$, where $\tau < \tau'$.

Lemma 2 Let $G$ be a temporal network. Suppose that temporal vertices $\mathbf{u}$ and $\mathbf{v}$ have corresponding vertices in the DAG representation $\tilde{G}$. Then, there is a temporal path from $\mathbf{u}$ to $\mathbf{v}$ in $G$ if and only if there is a directed path from $\mathbf{u}$ to $\mathbf{v}$ in $\tilde{G}$.

Proof Let $P = (\mathbf{v}_1, e_1, \mathbf{v}_2, \ldots, e_{k-1}, \mathbf{v}_k)$ be a temporal path from $\mathbf{v}_1 = \mathbf{u}$ to $\mathbf{v}_k = \mathbf{v}$. Without loss of generality, we assume that the time of $\mathbf{v}_i$ is equal to the starting time of $e_i$ or the ending time of $e_{i-1}$. Then, each $\mathbf{v}_i$ has a corresponding vertex in $\tilde{G}$. Let $\mathbf{v}_i = (v_i, \tau_i^v)$ for each $i \in [k]$ and $e_i = (v_i, v_{i+1}, \tau_i^e, \lambda_i)$ for each $i \in [k-1]$. Then, there is a directed path in $\tilde{G}$ that passes through $(v_1, \tau_1^v), (v_1, \tau_1^e), (v_2, \tau_1^e + \lambda_1), (v_2, \tau_2^v), (v_2, \tau_2^e), (v_3, \tau_2^e + \lambda_2), \ldots, (v_k, \tau_k^v)$ in this order. The converse easily follows from the correspondence explained above.
Fig. 3. Schematic describing the concept of temporal coverage centrality. The dashed polygonal lines represent the two temporal paths from vertex $v_4$ to $v_2$ that contain temporal vertex $\mathbf{v}$ in their durations.

3 Temporal coverage centralities 

In this section, we introduce the temporal coverage centrality and the temporal boundary coverage centrality.

3.1 Temporal coverage centrality 

Before defining TCC, we define the notion of coverage in temporal networks by generalizing its original version in static networks [37] as follows. Let $\mathbf{v}$ be a temporal vertex and $u, w$ be vertices. Let $\mathbf{u} = (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$ and $\mathbf{w} = (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$. Then, we say that $\mathbf{v}$ covers node pair $(u, w)$ if the following two conditions hold:

1. $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \tau_{\mathrm{eat}}(\mathbf{v}, w)$,

2. $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \tau_{\mathrm{ldt}}(\mathbf{v}, u)$.

In words, the earliest arrival time at $w$ when departing from $\mathbf{u}$ does not change even if we drop by $\mathbf{v}$ (condition 1), and the latest departure time from $u$ for arriving at $\mathbf{w}$ does not change even if we drop by $\mathbf{v}$ (condition 2). Figure 3 explains condition 1. Let us focus on $\mathbf{v} = (v_1, 7)$. Then, temporal vertices $\mathbf{u} = (v_4, \tau_{\mathrm{ldt}}(\mathbf{v}, v_4)) = (v_4, 4)$ and $\mathbf{w} = (v_2, \tau_{\mathrm{eat}}(\mathbf{v}, v_2)) = (v_2, 9)$ are determined as shown in the figure. We observe that, if we depart from $\mathbf{u}$ and are not forced to drop by $\mathbf{v}$, we can arrive at $\mathbf{w}' = (v_2, 8)$, which is earlier than $\mathbf{w}$. Hence, node pair $(u, w) = (v_4, v_2)$ is not covered by $\mathbf{v}$ but by $\mathbf{w}'$.

On the basis of this notion of coverage, the TCC value of $\mathbf{v}$ is defined as the fraction of pairs $(u, w) \in V \times V$ that are covered by $\mathbf{v}$. By definition, the TCC value of a temporal vertex is a real number in $[0, 1]$. If the TCC value is close to unity, the temporal vertex is said to be central in the sense that it covers many pairs of nodes. The formal definition is given in Algorithm 1 in an algorithmic manner.


3.2 Temporal boundary coverage centrality 

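Let $\mathbf{v} = (v, \tau)$ be a temporal vertex and $u, w$ be vertices. Let $\mathbf{u} = (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$ and $\mathbf{w} = (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$. Even if the TCC value of $\mathbf{v}$ is large, it does not always imply that removing the temporal edges involving $\mathbf{v}$ makes $\tau_{\mathrm{eat}}(\mathbf{u}, w)$ larger or $\tau_{\mathrm{ldt}}(\mathbf{w}, u)$ smaller. One particular reason for this is that sometimes we can reach $v$ from $\mathbf{u}$ earlier than $\tau$ and can leave $v$ later than $\tau$ to reach $\mathbf{w}$ (see temporal vertices $\mathbf{v}_2$ and $\mathbf{v}_3$ in Fig. 4). In some applications, we may want to regard such $\mathbf{v}$ as unimportant.

Algorithm 1 (The TCC value of $\mathbf{v}$)

1: $r \leftarrow 0$.
2: for $u \in V$ and $w \in V$ do
3:   $\mathbf{u} \leftarrow (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$.
4:   $\mathbf{w} \leftarrow (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$.
5:   if $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \tau(\mathbf{w})$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \tau(\mathbf{u})$ then
6:     $r \leftarrow r + 1$.
7: return $r / |V|^2$.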
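Algorithm 1 can be transcribed almost literally into code. The following is a naive sketch for small examples, not the efficient method of Section 4. It rests on several assumptions that are not part of the original text: temporal edges are tuples `(u, v, tau, lam)`, vertex labels are mutually comparable, the helper names `sweep` and `tcc` are hypothetical, and pairs $(u, w)$ for which $u$ cannot reach $\mathbf{v}$ or $\mathbf{v}$ cannot reach $w$ are not counted as covered.

```python
import heapq
from collections import defaultdict
from itertools import product

INF = float("inf")


def sweep(edges, v, tau, forward=True):
    """Dijkstra-like search from temporal vertex (v, tau).

    forward=True  returns the earliest arrival time at every reachable vertex.
    forward=False returns the latest departure time from every vertex that can
    reach (v, tau), obtained by running the same search with time mirrored.
    """
    sign = 1 if forward else -1
    nbrs = defaultdict(list)
    for a, b, t, lam in edges:
        if forward:
            nbrs[a].append((b, t, t + lam))      # usable if we are at a by time t
        else:
            nbrs[b].append((a, -(t + lam), -t))  # same edge, mirrored in time
    best = {v: sign * tau}
    heap = [(sign * tau, v)]
    while heap:
        cur, x = heapq.heappop(heap)
        if cur > best[x]:
            continue  # stale heap entry
        for y, deadline, new in nbrs[x]:
            if cur <= deadline and new < best.get(y, INF):
                best[y] = new
                heapq.heappush(heap, (new, y))
    return {x: sign * val for x, val in best.items()}


def tcc(vertices, edges, v, tau):
    """TCC value of the temporal vertex (v, tau), following Algorithm 1."""
    eat_v = sweep(edges, v, tau, forward=True)
    ldt_v = sweep(edges, v, tau, forward=False)
    covered = 0
    for u, w in product(vertices, repeat=2):
        if u not in ldt_v or w not in eat_v:
            continue  # ASSUMPTION: pairs not connected through (v, tau) are skipped
        t_u, t_w = ldt_v[u], eat_v[w]
        cond1 = sweep(edges, u, t_u, forward=True).get(w, INF) == t_w
        cond2 = sweep(edges, w, t_w, forward=False).get(u, -INF) == t_u
        if cond1 and cond2:
            covered += 1
    return covered / len(vertices) ** 2


V = ["a", "b", "c"]
E = [("a", "b", 1, 1), ("b", "c", 2, 1), ("a", "c", 1, 1)]
print(tcc(V, E, "b", 2))
```

In this toy network, the pair (a, c) is not covered by (b, 2) because the direct edge from a to c arrives earlier than any path through b.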

To address this issue, we define TBCC by imposing additional criteria on the notion of coverage as follows. Note that, if the focal temporal vertex $\mathbf{v}$ is an example of the situation stated in the previous paragraph, then $\tau_{\mathrm{eat}}(\mathbf{u}, v) < \tau$ or $\tau_{\mathrm{ldt}}(\mathbf{w}, v) > \tau$ should hold. Hence, we say that a pair $(u, w)$ of vertices is covered at a boundary by temporal vertex $\mathbf{v}$ if the following hold:

1. $(u, w)$ is covered by $\mathbf{v}$, and

2. $\tau_{\mathrm{eat}}(\mathbf{u}, v) = \tau$ or $\tau_{\mathrm{ldt}}(\mathbf{w}, v) = \tau$.

We explain this definition using the example shown in Fig. 4. Let $\mathbf{v}_i = (v, \tau_i)$ for $i \in [4]$. Note that $\mathbf{u} = (u, \tau_{\mathrm{ldt}}(\mathbf{v}_i, u))$ and $\mathbf{w} = (w, \tau_{\mathrm{eat}}(\mathbf{v}_i, w))$ hold for all $i \in [4]$, and that every $\mathbf{v}_i$ covers the vertex pair $(u, w)$. We can see that $\mathbf{v}_1$ and $\mathbf{v}_4$ cover $(u, w)$ at the boundary because $\tau_{\mathrm{eat}}(\mathbf{u}, v) = \tau_1$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, v) = \tau_4$. By contrast, $\mathbf{v}_2$ and $\mathbf{v}_3$ do not cover $(u, w)$ at the boundary.

On the basis of this notion of coverage at the boundary, the TBCC value of $\mathbf{v}$ is defined as the fraction of pairs $(u, w)$ that are covered at the boundary by $\mathbf{v}$. Similar to TCC, the TBCC value of a temporal vertex is a real number in $[0, 1]$ by definition. The formal definition is given in Algorithm 2 in an algorithmic manner.

In closing this section, we note the difference between the previous notions of temporal betweenness centrality and TCC (and TBCC). The main difference lies in the normalization of the number of vertex pairs covered by the temporal vertex. The definitions of TCC and TBCC do not normalize the number of such vertex pairs by the number of fastest temporal paths, whereas the previous temporal betweenness centrality divides the number of fastest paths that use the focal temporal vertex by the total number of fastest temporal paths in the focal time window, as the betweenness centrality for static networks does [15-18]. We adopted these definitions of TCC and TBCC for the following reasons. First, TCC and TBCC become free from any parameters because we do not need to set the time window in which to count the relevant fastest temporal paths for the normalization. Second, the TCC and TBCC values are easy to interpret as the fraction of the vertex pairs that have a fastest temporal path using the focal temporal vertex.
Fig. 4. Schematic describing the concept of temporal boundary coverage centrality. The dashed arrows represent the temporal edges that do not contribute to the centrality values of the source temporal vertices.


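Algorithm 2 (The TBCC value of $\mathbf{v}$)

1: $r \leftarrow 0$.
2: for $u \in V$ and $w \in V$ do
3:   $\mathbf{u} \leftarrow (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$.
4:   $\mathbf{w} \leftarrow (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$.
5:   if $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \tau(\mathbf{w})$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \tau(\mathbf{u})$ then
6:     if $\tau_{\mathrm{eat}}(\mathbf{u}, v) = \tau(\mathbf{v})$ or $\tau_{\mathrm{ldt}}(\mathbf{w}, v) = \tau(\mathbf{v})$ then
7:       $r \leftarrow r + 1$.
8: return $r / |V|^2$.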
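The boundary condition of Algorithm 2 adds only one extra test to the coverage check. The sketch below computes both values side by side on a toy network so that the difference becomes visible. As before, this is a naive illustration under assumptions that are not in the original text: edges are tuples `(u, v, tau, lam)`, the names `sweep` and `tcc_and_tbcc` are hypothetical, and pairs that are not connected through the focal temporal vertex are skipped.

```python
import heapq
from collections import defaultdict
from itertools import product

INF = float("inf")


def sweep(edges, v, tau, forward=True):
    """Earliest arrival times (forward) or latest departure times (backward)
    relative to the temporal vertex (v, tau), via a Dijkstra-like search."""
    sign = 1 if forward else -1
    nbrs = defaultdict(list)
    for a, b, t, lam in edges:
        if forward:
            nbrs[a].append((b, t, t + lam))
        else:
            nbrs[b].append((a, -(t + lam), -t))  # mirrored in time
    best = {v: sign * tau}
    heap = [(sign * tau, v)]
    while heap:
        cur, x = heapq.heappop(heap)
        if cur > best[x]:
            continue  # stale heap entry
        for y, deadline, new in nbrs[x]:
            if cur <= deadline and new < best.get(y, INF):
                best[y] = new
                heapq.heappush(heap, (new, y))
    return {x: sign * val for x, val in best.items()}


def tcc_and_tbcc(vertices, edges, v, tau):
    """Return (TCC, TBCC) of the temporal vertex (v, tau)."""
    eat_v = sweep(edges, v, tau, forward=True)
    ldt_v = sweep(edges, v, tau, forward=False)
    covered = boundary = 0
    for u, w in product(vertices, repeat=2):
        if u not in ldt_v or w not in eat_v:
            continue  # ASSUMPTION: pairs not connected through (v, tau) are skipped
        t_u, t_w = ldt_v[u], eat_v[w]
        fwd = sweep(edges, u, t_u, forward=True)   # earliest arrivals from (u, t_u)
        bwd = sweep(edges, w, t_w, forward=False)  # latest departures to (w, t_w)
        if fwd.get(w, INF) == t_w and bwd.get(u, -INF) == t_u:
            covered += 1
            # Boundary test (line 6 of Algorithm 2).
            if fwd.get(v, INF) == tau or bwd.get(v, -INF) == tau:
                boundary += 1
    n_pairs = len(vertices) ** 2
    return covered / n_pairs, boundary / n_pairs


V = ["u", "v", "w"]
E = [("u", "v", 1, 1), ("v", "w", 5, 1)]
print(tcc_and_tbcc(V, E, "v", 3))  # (v, 3) lies strictly between arrival and departure
print(tcc_and_tbcc(V, E, "v", 2))  # (v, 2) is the arrival boundary
```

Here information from u arrives at v at time 2 and leaves at time 5, so the temporal vertex (v, 3) covers the pair (u, w) but not at the boundary.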


4 Computing temporal coverage centralities 

We can straightforwardly calculate TCC and TBCC according to Algorithms 1 and 2. In this section, to handle large temporal networks, we give efficient methods for computing TCC and TBCC on the basis of a graph-indexing technique developed recently in the database community [27], in particular, the method proposed in Ref. [42]. The key idea is to speed up the computation of $\tau_{\mathrm{eat}}$ and $\tau_{\mathrm{ldt}}$ in Algorithms 1 and 2. We describe the exact computation of TCC and TBCC in this section, and we also give algorithms to approximate the TCC and TBCC values whose running time is polylogarithmic in the total number of vertices in $G$ (see Appendix B).

In a directed network, we say that a vertex $v_t$ is reachable from $v_s$ if there is a directed path from $v_s$ to $v_t$. By Lemma 2, to count the pairs $(u, w)$ covered by $\mathbf{v}$ (at the boundary, if needed), we want to efficiently answer reachability queries in the DAG representation $\tilde{G}$ of a given temporal network $G$. To this end, it is beneficial to construct an index of $\tilde{G}$ that computes the reachability between any pair of nodes on the basis of information on the reachability between a fraction of node pairs. Such an index is often called a reachability oracle in the database community [38-42].

The basic idea of the construction of a reachability oracle for the present problem is the following. Naively, we want to compute a large table that stores the reachability of every pair of temporal vertices. If this were possible, we could answer reachability queries just by looking at that table. Unfortunately, however, completing this table requires $O(|\tilde{V}|^2)$ computation time and $O(|\tilde{V}|^2)$ space, where $\tilde{V}$ is the vertex set of the DAG representation, which could be prohibitively slow and large. The reachability oracle overcomes this problem by carefully storing partial information about the network. Based on this information, it efficiently computes the reachability for the whole network.

Table 1. Basic statistics of the datasets. Variables $n$, $m$, $\tilde{n}$, and $\tau_{\max}$ are the total number of vertices and temporal edges in $G$, the total number of vertices in the DAG representation $\tilde{G}$, and the maximum ending time of a temporal edge, respectively. The datasets are arranged in increasing order of $m$.

Name             $n$     $m$      $\tilde{n}$   $\tau_{\max}$
Infectious [43]  410     17298    32218         1393
HT09 [43]        113     20187    48477         5246
Hospital [44]    75      32424    65296         9454
Irvine [45]      1899    59835    220772        58192
Email [46]       167     82927    254533        57843

The method proposed in Ref. [42], which we will use for the numerical experiments in Section 5, computes a small table for each temporal vertex that stores reachability from (and to) a certain set of other temporal vertices that is smaller than the set of all temporal vertices. How small the table becomes depends on the structure of each temporal network. Then, we can answer the reachability from a temporal vertex $\mathbf{u}$ to a temporal vertex $\mathbf{v}$ by checking whether there is another temporal vertex $\mathbf{w}$ such that we can confirm the reachability from $\mathbf{u}$ to $\mathbf{w}$ and from $\mathbf{w}$ to $\mathbf{v}$ using the small tables of $\mathbf{u}$ and $\mathbf{v}$. If there is such $\mathbf{w}$, we indeed have a directed path from $\mathbf{u}$ to $\mathbf{v}$. The challenging part of the construction lies in guaranteeing the other direction: if there is a directed path from $\mathbf{u}$ to $\mathbf{v}$, then there is always such $\mathbf{w}$. In addition, we need to be able to compute the small table for each vertex efficiently. This method resolves these issues, so that it can handle directed networks of millions of edges with a query time of less than a microsecond on average (see Ref. [42] for further technical details).
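The idea of answering a query by intersecting two small tables can be illustrated with a generic pruned 2-hop labeling. The sketch below is a simplified stand-in written for this explanation and should not be read as the method of Ref. [42]; the function names, the degree-based ordering heuristic, and the set-based label storage are all assumptions.

```python
from collections import deque


def build_two_hop_labels(nodes, arcs):
    """Build label sets such that s reaches t iff label_out[s] and label_in[t] intersect."""
    out_adj = {x: [] for x in nodes}
    in_adj = {x: [] for x in nodes}
    for a, b in arcs:
        out_adj[a].append(b)
        in_adj[b].append(a)
    # Heuristic: process high-degree nodes first so that they prune later searches.
    order = sorted(nodes, key=lambda x: -(len(out_adj[x]) + len(in_adj[x])))
    label_in = {x: set() for x in nodes}   # hubs known to reach x
    label_out = {x: set() for x in nodes}  # hubs known to be reachable from x

    def pruned_bfs(hub, adj, forward):
        queue, seen = deque([hub]), {hub}
        while queue:
            x = queue.popleft()
            s, t = (hub, x) if forward else (x, hub)
            if not label_out[s].isdisjoint(label_in[t]):
                continue  # already answerable through an earlier hub: prune here
            (label_in if forward else label_out)[x].add(hub)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    queue.append(y)

    for hub in order:
        pruned_bfs(hub, out_adj, forward=True)   # nodes reachable from hub
        pruned_bfs(hub, in_adj, forward=False)   # nodes that reach hub
    return label_out, label_in


def reachable(label_out, label_in, s, t):
    """Reachability query: is there a common hub w with s -> w and w -> t?"""
    return not label_out[s].isdisjoint(label_in[t])


nodes = {1, 2, 3, 4, 5}
arcs = {(1, 2), (2, 3), (1, 4), (4, 3), (3, 5)}
label_out, label_in = build_two_hop_labels(nodes, arcs)
print(reachable(label_out, label_in, 1, 5), reachable(label_out, label_in, 2, 4))
```

A query only touches the two label sets involved, which is what makes repeated reachability tests cheap once the index has been built.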

5 Results 

The basic statistics of the datasets we use are summarized in Table 1. It should be noted that we do not use the actual time stamps in the datasets but define $\tau$ by the order of unique values of the time stamps. For example, if the dataset consists of two time stamps $t = 1, 4$, we translate them into $\tau = 1, 2$. In addition, we assume that $\lambda$ is equal to the finest time resolution of each dataset for all the temporal edges. Although interactions in Irvine and Email are directed (i.e., from sender to receiver(s) of messages), we regard them as undirected.
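The translation of raw time stamps into their rank order can be sketched as follows. The input format, a list of `(u, v, t)` contact records, and the function name are assumptions made for this illustration.

```python
def rank_time_stamps(events):
    """Replace each raw time stamp by its rank among the unique time stamps (from 1)."""
    unique_times = sorted({t for _, _, t in events})
    rank = {t: i + 1 for i, t in enumerate(unique_times)}
    return [(u, v, rank[t]) for u, v, t in events]


events = [("a", "b", 1), ("b", "c", 4), ("a", "c", 4)]
print(rank_time_stamps(events))
```

With the two time stamps 1 and 4 of the example in the text, the records are mapped to times 1 and 2.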

5.1 Statistics of TCC and TBCC 

Figure 5 depicts the rank plots of the TCC and TBCC values of temporal vertices in decreasing order. In all the datasets except for the Email data, at least 10% of temporal vertices have TCC values larger than 0.1 (Fig. 5(a)). This fact implies the redundancy of temporal networks in the sense that, when information flows between temporal vertices, it can drop by different vertices without increasing the total duration of the temporal paths. However, there are a smaller number of temporal vertices with large TBCC values (Fig. 5(b)). This fact also implies the redundancy of temporal networks in a different sense: when information flows between temporal vertices, it is not forced to be at a certain vertex at a certain time.

Fig. 5. Rank plots of the (a) TCC and (b) TBCC values.

Fig. 6. Rank plots of the (a) TCC and (b) TBCC values in randomized temporal networks. The curves for Irvine are not provided because the computation did not stop.

To see the impact of the structural peculiarity of temporal networks on these distributions, we computed the centrality values of temporal vertices in randomized temporal networks. We randomize an original temporal network by replacing the two ends of each temporal edge by vertices chosen uniformly at random (similar to the procedure called randomized edges with randomly permuted times in Ref. [5]). The resultant centrality values are shown in Fig. 6. We notice that more temporal vertices have sufficiently large centrality values (e.g., larger than 0.1) in real-world temporal networks (Fig. 5) than in randomized temporal networks (Fig. 6). The maximum centrality values are larger in the randomized than in the original networks for HT09 and Hospital, and vice versa for Infectious and Email. This fact implies that the way the flow concentrates upon temporal vertices depends on each dataset.
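The randomization step can be sketched in a few lines. This is only an illustration: the tuple format `(u, v, tau, lam)`, the function name, and in particular the choice to draw two distinct end vertices (so that no self-loops arise) are assumptions that the text does not specify.

```python
import random


def randomize_endpoints(vertices, edges, seed=None):
    """Replace both ends of every temporal edge by uniformly random vertices,
    keeping the starting time and the duration of each edge."""
    rng = random.Random(seed)
    vertices = list(vertices)
    randomized = []
    for _, _, tau, lam in edges:
        u, v = rng.sample(vertices, 2)  # ASSUMPTION: two distinct end vertices
        randomized.append((u, v, tau, lam))
    return randomized


V = ["a", "b", "c", "d"]
E = [("a", "b", 1, 1), ("b", "c", 2, 1), ("c", "d", 3, 1)]
R = randomize_endpoints(V, E, seed=0)
print(R)
```

The procedure preserves the number of temporal edges and their times while destroying any correlation between the times and the end vertices.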

It should be noted that the calculation for the randomized Irvine dataset did not finish, even though that for the Email dataset, whose DAG representation has more vertices than that of Irvine (Table 1), did. We can explain this result by the increase in the number of vertex pairs connected via temporal paths. The dominant factor of the computational time is the number of vertex pairs connected via temporal paths because we have to consider all such vertex pairs to calculate the centrality value of a temporal vertex. After the randomization, most of the vertex pairs are likely to have temporal paths, and the number of such pairs scales with $n^2$. Given that the Irvine dataset has the largest $n$ among the five datasets we consider, it makes sense that it requires a far longer computational time than the other four datasets.

Next, we examine how the centrality values change over time owing to the structural transformation of the temporal networks. Figure 7 depicts the change in the maximum TCC and TBCC values over the temporal vertices present at each time, together with the number of temporal vertices present at each time, for Infectious and Hospital. In both datasets shown in Fig. 7, we can see some periodic patterns in the number of temporal vertices. However, the maximum centrality values are not much affected by these patterns, which implies that these values are determined not by the mere activity level in the networks but by the structure of the temporal network. In addition, the fact that the maximum centrality values vary considerably throughout the observation periods suggests that we should carefully incorporate temporal structure to assess the importance of vertices. Generally, the maximum TCC values are larger than the maximum TBCC values, which makes sense according to their definitions (i.e., TBCC only counts the coverage of the temporal paths at the boundary, whereas TCC does not impose this boundary criterion).

When we focus on a particular vertex, its two centrality values also vary over time in different manners. Figure 8 depicts the change in the TCC and TBCC values of the vertex that is involved in the largest number of temporal edges in each of the two datasets, Infectious and Hospital. The TCC value of the vertex increases with time in Infectious (Fig. 8(a)), simply because the number of present temporal vertices increases and thus the focal vertex can reach these vertices in this period (also see Fig. 7(a)). By contrast, the TBCC value does not exhibit such an increasing trend. This fact supports our original purpose of introducing TBCC, i.e., to discount the centrality values of the temporal vertices on dispensable temporal paths. In addition, the plot of TBCC unveils that even the vertex with the largest number of temporal edges does not always bridge effective temporal paths. In Hospital (Fig. 8(b)), we can observe that the temporal edges associated with the focal vertex are partitioned into five time intervals, in each of which temporal edges occur in a bursty manner, and the centrality values of the vertex become larger at the beginning and the end of each of these time intervals. This observation makes sense because, at the endpoints of a time interval, a vertex tends to play the role of a gateway for information flowing into or out of the time interval.

Fig. 7. Change in the maximum TCC and TBCC values over temporal vertices at present in (a) Infectious and (b) Hospital. For readability, we smoothed the curves by taking the average over a sliding window with a length of 100 units of time.

Fig. 8. Change in the TCC and TBCC values of the vertex with the largest number of temporal edges. (a) Vertex with label 195 in Infectious and (b) vertex with label 1115 in Hospital.

The computational efficiency of the two centralities enables us to draw a map of the centrality values of all the temporal vertices over time. This map reveals the existence of bottleneck time regions in the empirical temporal networks. Figures 9(a) and 9(b) depict the TCC values of temporal vertices as a heat map for Infectious and Hospital, respectively. In both datasets, most temporal vertices have non-negligible TCC values, and these results support the notion of redundancy of temporal networks (see Fig. 5(a)) such that all the vertices can belong to redundant temporal paths. In addition, the temporal vertices with the largest centrality values appear in the middle of the observation period (around time 700 and 6000 in Figs. 9(a) and (b), respectively), and the temporal vertices at the same time tend to have similar TCC values. We found the same phenomenon in all the datasets (see Electronic Supplementary Materials for the plots of the other datasets), and the existence of this bottleneck time period seems to be a common property of empirical temporal networks.

If we are interested in when these bottleneck time pe¬ 
riods begin and end, we can look at the heat map of the 
TBCC values. As an example, Fig. 9(c) magnifies a bot¬ 
tleneck time period in Infectious (Fig. 9(a)) in which we 
observe many temporal vertices with the largest TCC val¬ 
ues. However, the boundary of the bottleneck period is not 
clear in the figure. Figure 9(d) shows the heat map of the 
TBCC values in the same area as shown in Fig. 9(c). As 
we observe, the TBCC values indicate the boundaries at 
$\tau \approx 660$, 680, and 750. This boundary information should 


be meaningful, for example, when we narrow the candi¬ 
dates of the vertices to be vaccinated for epidemic spread¬ 
ing on temporal networks [47-49]. 

We finally stress again that it becomes possible to com¬ 
pute these statistics and analyze the structure of temporal 
networks in such detail because of the efficient computa¬ 
tion of TCC and TBCC using the reachability oracle. 

5.2 Delay caused by removing a central temporal 
vertex 

In closing this section, to verify the relevance of the pro¬ 
posed centrality notions at the microscopic level, we briefly 
report that removing a temporal vertex with large TCC 
and TBCC values is effective in delaying the propagation 
of information. 

Let $G = (V, E)$ be a temporal network, where $V = 
\{v_1, v_2, \ldots, v_n\}$. For a temporal vertex $\mathbf{v} = (v, \tau)$, let 
$\mathbf{v}_i = (v_i, \tau_{\mathrm{eat}}(\mathbf{v}, v_i))$ for each $i \in [n]$ and $\tau'$ be the 
(unique) time such that $\mathbf{v}$ has an edge to $\mathbf{v}' = (v, \tau')$. 
We say that $\mathbf{v}_i$ gets prolonged by removing $\mathbf{v}$ if 
$\tau_{\mathrm{eat}}(\mathbf{v}, v_i)$ becomes larger by removing edges incident 
to $\mathbf{v}$ (and we keep edge $(\mathbf{v}, \mathbf{v}')$). 
In a similar manner, we say that $\mathbf{v}_i$ becomes disconnected 
by removing $\mathbf{v}$ if we cannot reach $\mathbf{v}_i$ from $\mathbf{v}$ after removing 
edges incident to $\mathbf{v}$ (where, again, we keep edge $(\mathbf{v}, \mathbf{v}')$). 
Fig. 9. Heat maps of the TCC values for (a) Infectious and (b) Hospital. (c) Heat map magnifying the area with 650 < t < 800 
and 100 < ID < 220 in (a). (d) Heat map of the TBCC values in the same area as shown in (c). 


We investigate the fraction of prolonged or disconnected 
temporal vertices among $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ when we remove 
one of the top 100 temporal vertices with respect to the TCC 
or TBCC values. It should be noted that the fraction of 
temporal vertices becoming prolonged or disconnected is 
nontrivial because the definitions of TCC and TBCC take 
into account temporal paths both before and after the focal 
temporal vertex. As a baseline for comparison, we also 
conduct the same test by removing a temporal vertex cho¬ 
sen randomly. For the random case, we randomly choose 
100 temporal vertices without replacement and take the 
average of the fraction of prolonged or disconnected tem¬ 
poral vertices for these 100 trials. 
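
The removal test described above can be illustrated with a small Python sketch. It is only a toy version under stated assumptions, not the code used for the experiments: a temporal network is given as directed contacts `(a, b, t)`, a contact that leaves `a` at time `t` is assumed to reach `b` at time `t + 1`, fractions are taken over all n vertices, and the names `earliest_arrival` and `removal_effect` are hypothetical. Removing the temporal vertex (source, start_time) is modeled by deleting its contacts at that time while still allowing the vertex to wait for its later contacts.

```python
import math


def earliest_arrival(contacts, source, start_time):
    """Earliest arrival time at each reachable vertex from (source, start_time).

    Assumption: a contact (a, b, t) leaves a at time t and reaches b at t + 1.
    """
    arrival = {source: start_time}
    # One pass over contacts in nondecreasing time order suffices because
    # every traversal strictly increases time.
    for a, b, t in sorted(contacts, key=lambda c: c[2]):
        if arrival.get(a, math.inf) <= t:
            arrival[b] = min(arrival.get(b, math.inf), t + 1)
    return arrival


def removal_effect(contacts, vertices, source, start_time):
    """Fractions of vertices prolonged / disconnected by removing (source, start_time)."""
    before = earliest_arrival(contacts, source, start_time)
    # Drop the contacts of the removed temporal vertex; later contacts of the
    # same vertex stay available, which plays the role of the kept waiting edge.
    kept = [c for c in contacts if not (c[0] == source and c[2] == start_time)]
    after = earliest_arrival(kept, source, start_time)

    prolonged = disconnected = 0
    for v in vertices:
        if v == source or v not in before:
            continue  # only vertices reachable before the removal can change
        if v not in after:
            disconnected += 1
        elif after[v] > before[v]:
            prolonged += 1
    n = len(vertices)
    return prolonged / n, disconnected / n


if __name__ == "__main__":
    toy_contacts = [("a", "b", 1), ("a", "c", 1), ("c", "e", 2),
                    ("a", "b", 3), ("b", "d", 4)]
    print(removal_effect(toy_contacts, ["a", "b", "c", "d", "e"], "a", 1))
```

In this toy network, vertex b is still reached through the later contact at time 3 (prolonged), whereas c and e lose their only route (disconnected).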

The results of the removal test of temporal vertices 
are summarized in Table 2 for the five datasets. As we 
expected, the removals according to the largest centrality 
values make more temporal vertices prolonged or discon¬ 
nected than the random removals. The removals according 
to the largest TCC values tend to prolong a certain frac¬ 
tion of temporal vertices for all the datasets considered. 
However, they make few temporal vertices disconnected. 
These outcomes make sense because the number of other 
temporal paths running alongside the temporal path go¬ 
ing through the focal temporal vertex is not considered in 
TCC (also see Section 3.1). By contrast, the removals ac¬ 
cording to the largest TBCC values make a considerable 
fraction of temporal vertices prolonged and disconnected. 


Remarkably, 50.8% of the temporal vertices, on average, 
become disconnected from a removed temporal vertex in 
Irvine. There is no clear distinction between the results of 
the offline (i.e., Infectious, HT09, and Hospital) and online 
(i.e., Irvine and Email) networks. 


6 Conclusions 

We introduced two centrality notions for temporal 
networks—temporal coverage centrality and temporal 
boundary coverage centrality—to represent the impor¬ 
tance of a temporal vertex by the fraction of vertex pairs 
that can or should use the temporal vertex when sending 
information as quickly as possible. Compared to centrality 
notions proposed in previous work, TCC and TBCC have 
two advantages: (i) Parameters or time windows do not 
need to be set and (ii) computation time is reasonable. 

Applying TCC and TBCC to multiple datasets of empirical 
temporal networks, we revealed that there tend to 
be particular bottleneck time periods that play a crucial 
role in propagating information quickly and that the rest 
of each network is redundant in the sense that there are 
many temporal paths that send information with the same 
duration. Although such structural redundancy in temporal 
networks was suggested in some previous studies [28-30], 
our centrality notions enable us to clearly quantify 
Table 2. Results of the removal of temporal vertices. The number in each cell presents the average fraction of disconnected (or 
prolonged) temporal vertices over the 100 trials of the removal based on the given procedure (i.e., according to the largest TCC 
and TBCC values or random pick). 

             TCC                       TBCC                      Random 
Dataset      Prolonged  Disconnected   Prolonged  Disconnected   Prolonged  Disconnected 
Infectious   0.013      0.001          0.014      0.232          0.010      0.001 
HT09         0.082      0.001          0.264      0.069          0.031      0.007 
Hospital     0.049      0.001          0.156      0.257          0.037      0.001 
Irvine       0.014      0.003          0.006      0.508          0.018      0.012 
Email        0.136      0.006          0.375      0.016          0.054      0.000 


and visualize this property. We believe that the centrality 
notions we proposed are useful for further studying the 
structure of temporal networks and verifying generative 
models of temporal networks. 

The datasets Infectious, HT09, and Hospital used in the 
numerical experiments were originally collected and 
published by the SocioPatterns collaboration 
(http://www.sociopatterns.org/). Datasets HT09 and Hospital 
were downloaded from the SocioPatterns website. 
Datasets Infectious, Irvine, and Email were downloaded 
from the Koblenz Network Collection 
(http://konect.uni-koblenz.de/). The authors thank Dr. James 
Cheng for valuable discussions. Yuichi Yoshida is sup¬ 
ported by JSPS Grant-in-Aid for Young Scientists (B) 
(No. 26730009), MEXT Grant-in-Aid for Scientific Re¬ 
search on Innovative Areas (24106003), and JST, ERATO, 
Kawarabayashi Large Graph Project. T.T., Y. Yano and 
Y. Yoshida designed the research. Y. Yoshida constructed 
the algorithms to compute the centralities and gave the 
proof of their computational complexity. Y. Yano imple¬ 
mented the algorithms. T.T. analyzed the data sets. Y. 
Yano performed the numerical experiments of the removal 
of temporal vertices. T.T., Y. Yano, and Y. Yoshida dis¬ 
cussed all the results and wrote the manuscript. 

A Computational complexity of calculating 
$\tau_{\mathrm{eat}}$ and $\tau_{\mathrm{ldt}}$ with the reachability oracle 

With the aid of the reachability oracle, we can efficiently 
compute $\tau_{\mathrm{eat}}$ and $\tau_{\mathrm{ldt}}$: 

Lemma 3 Let G be a temporal network and G be its 
DAG representation. We can compute $\tau_{\mathrm{eat}}$ and $\tau_{\mathrm{ldt}}$ with 
$O(\log |E|)$ queries to the reachability oracle of G. 

Proof We only consider $\tau_{\mathrm{eat}}$, as $\tau_{\mathrm{ldt}}$ can be computed 
similarly. Given a temporal vertex $\mathbf{v}$ and a vertex $w$, $\tau_{\mathrm{eat}}(\mathbf{v}, w)$ is 
the minimum $\tau \in \mathbb{R}$ such that there is a temporal path 
from $\mathbf{v}$ to $(w, \tau)$. To find such $\tau$, we perform a binary 
search using the reachability oracle. Since the number of 
possible values for $\tau$ is $O(|E|)$, the number of queries is 
$O(\log |E|)$. 
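
The binary search used in the proof of Lemma 3 can be sketched in Python as follows. This is a minimal illustration and not the implementation used in the paper: the names `earliest_arrival_time` and `make_naive_oracle`, the contact format `(a, b, t)`, and the unit traversal time are assumptions made for the example, and the naive oracle only stands in for a real reachability oracle. The search relies on reachability of (w, t) being monotone in t, which we assume holds because information can wait at a vertex.

```python
import math


def earliest_arrival_time(reachable, source, target, times):
    """Smallest t in `times` with reachable(source, (target, t)), or None.

    `times` must be sorted; `reachable` must be monotone in t. The number of
    oracle calls is O(log len(times)).
    """
    lo, hi = 0, len(times)
    while lo < hi:
        mid = (lo + hi) // 2
        if reachable(source, (target, times[mid])):
            hi = mid          # target is reachable by times[mid]; try earlier
        else:
            lo = mid + 1      # not reachable yet; look at later times
    return times[lo] if lo < len(times) else None


def make_naive_oracle(contacts):
    """Stand-in oracle (assumption): a contact (a, b, t) leaves a at t, reaches b at t + 1."""
    ordered = sorted(contacts, key=lambda c: c[2])

    def reachable(src, dst):
        (u, t0), (w, t1) = src, dst
        arrival = {u: t0}
        for a, b, t in ordered:
            if arrival.get(a, math.inf) <= t:
                arrival[b] = min(arrival.get(b, math.inf), t + 1)
        return arrival.get(w, math.inf) <= t1

    return reachable


if __name__ == "__main__":
    oracle = make_naive_oracle([("a", "b", 1), ("b", "c", 3), ("a", "c", 7)])
    print(earliest_arrival_time(oracle, ("a", 0), "c", list(range(10))))  # 4
```

The latest departure time can be found by the symmetric search over departure times, again under the monotonicity assumption.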

Lemma 4 Let G be a temporal network and G be its DAG 
representation. For any temporal vertex $\mathbf{v}$, we can compute 
the TCC and TBCC values of $\mathbf{v}$ with $O(|V|^2 \log |E|)$ 
queries to the reachability oracle of G. 


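Algorithm 3 (Approximation to the TCC value of $\mathbf{v}$) 
1: $r \leftarrow 0$. 
2: for $i = 1$ to $k := \frac{1}{2\epsilon^2} \log(2|V|^2)$ do 
3:   Sample vertices $u, w \in V$ uniformly. 
4:   $\mathbf{u} \leftarrow (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$. 
5:   $\mathbf{w} \leftarrow (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$. 
6:   if $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \mathbf{w}$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \mathbf{u}$ then 
7:     $r \leftarrow r + 1$. 
return $r/k$. 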


Proof The proof is immediate from Lemma 3 and the al¬ 
gorithm definitions of TCC (Algorithm 1) and TBCC (Al¬ 
gorithm 2). 

B Approximate computation of temporal 
coverage centralities 

By Lemma 4 (see Section 4), the number of queries to 
the reachability oracle for computing the TCC and TBCC 
values is (almost) quadratic in the number of vertices of a 
temporal network. However, in some applications, we may 
want to compute these centralities faster. Here, we intro¬ 
duce a standard technique that enables us to approximate 
these centrality values with a sublinear number of queries. 
We only explain the case of TCC; the case of TBCC is 
performed in a similar way. 

Algorithm 3 is an approximate method for computing 
the centrality value. The difference from Algorithm 1 is 
that, instead of enumerating all pairs $(u, w)$, we only sample 
$O(1/\epsilon^2)$ pairs of vertices and take the average over 
them, where $\epsilon$ is the parameter controlling the possible 
error in approximation. 
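
The sampling idea can be illustrated with the following Python sketch. It is a simplified stand-in rather than the authors' implementation: the predicate `pair_uses_v(u, w)` is a hypothetical callback that abstracts the oracle-based test performed for each sampled pair, and the function names are assumptions. The sample size follows the choice $k = \log(2|V|^2)/(2\epsilon^2)$, rounded up, with the natural logarithm.

```python
import math
import random


def sample_size(eps, n_vertices):
    """Number of sampled pairs: k = log(2 |V|^2) / (2 eps^2), rounded up."""
    return math.ceil(math.log(2 * n_vertices ** 2) / (2 * eps ** 2))


def approximate_coverage(vertices, pair_uses_v, eps, rng=random):
    """Estimate the fraction of ordered pairs (u, w) for which pair_uses_v holds.

    pair_uses_v is a hypothetical predicate standing in for the per-pair test
    against the reachability oracle; it is not part of the original text.
    """
    k = sample_size(eps, len(vertices))
    hits = 0
    for _ in range(k):
        u = rng.choice(vertices)   # u and w are drawn independently and
        w = rng.choice(vertices)   # uniformly, with replacement
        if pair_uses_v(u, w):
            hits += 1
    return hits / k


if __name__ == "__main__":
    print(sample_size(0.05, 100))  # 1981
    toy_vertices = list(range(100))
    # Toy predicate whose true fraction is 0.3, used only to exercise the sampler.
    estimate = approximate_coverage(toy_vertices, lambda u, w: u < 30, 0.05,
                                    random.Random(0))
    print(estimate)
```

For 100 vertices and $\epsilon = 0.05$, 1981 sampled pairs replace the 10,000 pairs that an exhaustive enumeration would examine.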

To show that Algorithm 3 gives a good approximation, 
we need to recall Hoeffding’s inequality: 

Lemma 5 (Hoeffding's inequality [50]) 

Let $X_1, X_2, \ldots, X_k$ be independent random variables in 
$[0,1]$ and $\overline{X} = (1/k) \sum_{i=1}^{k} X_i$. Then, for any positive real 
number $t$, 

$\Pr[|\overline{X} - \mathrm{E}[\overline{X}]| > t] \le 2 \exp(-2 t^2 k)$. 

Lemma 6 Let G be a temporal network and G be its 
DAG representation. For any temporal vertex $\mathbf{v}$, with 
$O(\log^2 |V| / \epsilon^2)$ queries to the reachability oracle of G, we 
can compute the TCC value of $\mathbf{v}$ with additive error of $\epsilon$ 
with probability of at least $1 - 1/|V|^2$. 

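Proof Consider Algorithm 3 and let $\tilde{C}(\mathbf{v})$ denote its output. 
Algorithm 3 issues $O(\log^2 |V| / \epsilon^2)$ queries since $\tau_{\mathrm{ldt}}$ 
and $\tau_{\mathrm{eat}}$ can be computed with $O(\log |E|)$ queries (see 
Lemma 3). Let $X_i$ be the indicator of the event that we increment 
$r$ in the $i$-th loop and $\overline{X} = (1/k) \sum_{i=1}^{k} X_i$. Note that 

$\mathrm{E}[\tilde{C}(\mathbf{v})] = \mathrm{E}[\overline{X}] = (1/k) \sum_{i=1}^{k} \mathrm{E}[X_i] = C(\mathbf{v})$, where $C(\mathbf{v})$ 

is the TCC value of $\mathbf{v}$. Since $X_1, X_2, \ldots, X_k$ are independent 
random variables in $[0,1]$, by Lemma 5, we have 

$\Pr[|\tilde{C}(\mathbf{v}) - C(\mathbf{v})| > \epsilon] = \Pr[|\overline{X} - C(\mathbf{v})| > \epsilon]$ 

$\le 2 \exp\left(-2 \epsilon^2 \cdot \frac{1}{2\epsilon^2} \log(2|V|^2)\right) = 2 \exp(-\log(2|V|^2))$ 

$= \frac{1}{|V|^2}$. 

Hence, the lemma holds. 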

Recalling that the query time of the reachability oracle 
is tiny, we find that the running time of Algorithm 3 can 
be seen as polylogarithmic in the input size. This is a 
major advantage of TCC and TBCC over other centrality 
notions. 
Fig. S1. Average (solid line) and 10-90% values (shaded areas) of TCC at each time for (a) Infectious, (b) HT09, (c) Hospital, 
(d) Irvine, and (e) Email. We consider only the temporal vertices involved in temporal edges with other vertices to calculate 
the statistics. For (d) and (e), we smoothed the curves by taking the average over a sliding window with a length of 100 units 
of time, because the time resolutions of the observations are so high that there is not a sufficient number of temporal vertices 
to take the average at most of the time points. 



































































